We follow Silva (###) approach of calculating and decomposing yield gaps in agriculture. In this approach, yield gaps are decomposed into technology yield gaps, resource yield gaps and efficiency yield gaps.
We address two challenges and potential advancements that have been identified in the literature on this approach, (i) granular decomposition of yield gaps beyond technology, resource, and efficiency yield gaps to actually consider the contribution of management and policy relevant variables like early sowing, and weed management by decomposing the efficiency yield gaps, and contribution to reducing inputs through decomposition of the resource gaps, and (2) spatially granular decomposition of yield gaps by demonstrating in which location or grid is each of the management practices more relevant to close the yield gaps.
In this workbook we will a recent approach of estimating marginal impact on efficiency for each of the efficiency variables at spatially disaggregated level and demonstrating that this approach can allow one to explore the sources of inefficiency at a spatially disaggregated level.
This then allow one to decompose the efficiency yield gap into policy and action oriented components for which farmers and other stakeholders can attempt to make adjustments. We also follow Pross et al () in identify resource yield gaps and identifying which input usage can be reallocated. We do these decompositions at a spatially disaggregated level thereby allowing one to identify if policies related to say subsidies or input assistance can help in increasing yields or whether other management mechanisms (for example split application) can be more helpful in a particular location.
On understanding which aspects of the efficieny yield gaps should be prioritized where, we decompose these into genotype, environment, management and socioeconomics (GEMS).
2 Conventional Yield Gap Decomposition (following on Silva et al)
# Yield gap in tons per haeff_d$yield_gap_t_ha <- eff_d$max_eff_yield - eff_d$yield_t_haggplot(data = eff_d) +geom_histogram(aes(x = yield_gap_t_ha), bins =20) +labs(title ="Efficiency yield gap(t/ha)", x ="Efficiency yield gap(t/ha)") +theme_bw()
Code
# Yield gap reduction due to early sowing in tons/haeff_d$early_sowing_yield_gap_redn <- eff_d$yield_gap_t_ha * eff$efficiencyggplot(data = eff_d) +geom_histogram(aes(x = early_sowing_yield_gap_redn), bins =20) +labs(title ="Early sowing effect (t/ha)", x ="Early sowing effect (t/ha)") +theme_bw()
Code
# Input savings due to early sowing in tons per ha
3 Geoadditive stochastic frontier analysis
3.1 Estimation using thin plate spline
Code
# LDSestim_gam=LDSestim %>% drop_na()# Write formulae for parameters for the mean model, sigma model, and inefficiency model# We use the original functional forms: identity for mean model, log sigma for sigma model and log ineff for the inefficiencymu_formula <-log(L.tonPerHectare)~Nperha + wc2.1_30s_elev +s(O.largestPlotGPS.Longitude, O.largestPlotGPS.Latitude,by = Weedmanaged)sigma_v_formula <-~1+ temp + precipsigma_u_formula <-~1+s(O.largestPlotGPS.Longitude, O.largestPlotGPS.Latitude, by = Sowing_Date_Early) +s(O.largestPlotGPS.Longitude, O.largestPlotGPS.Latitude,by = Weedmanaged)s <--1# production function specification# Fit model# If using older versions of R you can use the following code# model <- mgcv::gam(# formula = list(mu_formula, sigma_v_formula, sigma_u_formula),# data = LDS_wheat_public_cleaned, family = normhnorm(s = s), optimizer = c("efs")# )model_dsfa<-dsfa(formula=list(mu_formula, sigma_v_formula, sigma_u_formula),data=LDS_wheat_public_cleaned, family=comper(s=s, distr="normhnorm"), optimizer =c("efs"))summary(model_dsfa)
pred <-SpatialPoints(elevationglobal_geodata_aoi)@coordspred <-as.data.frame(pred)names(pred)[1:2] <-c("lon", "lat")# pred <- expand.grid(lon = seq(82, 89, length = 100),lat = seq(24,28, length = 100))effdt_battese_hat <-predict(b, newdata = pred)effdt_battese_hat <-as.data.frame(effdt_battese_hat)pred_effdt_battese_hat <-cbind(pred, effdt_battese_hat)pred_effdt_battese_hat$sigma <-NULLlibrary(terra)myras <-rast(pred_effdt_battese_hat, type ="xyz")plot(myras)
Code
library(raster)myras2 <-raster(myras)library(sf)India_aoi_sf_dis_sp <-as_Spatial(India_aoi_sf_dis)myras2 <-mask(myras2, India_aoi_sf_dis_sp)plot(myras2, main ="Gridded efficiency scores")
Code
# Estimate the efficiency yield gap (%)effic_yield_gap_perc <-100- myras2 *100plot(effic_yield_gap_perc, main ="Efficiency yield gap(%)")
Code
# Efficiency yield## Krige yields over the area of intersect## use the predicted yields and efficiency scorelibrary(bamlss)set.seed(111)effdt_battese$L.tonPerHectare <-exp(effdt_battese$`log(L.tonPerHectare)`)f_y <- L.tonPerHectare ~s(lon, lat)## estimate model.b_y <-bamlss(f_y, data = effdt_battese)
Our goal here to show the spatial variation in the effect of a variable on the efficiency scores
Using formula from Pross et al (2018), technical efficiency for each grid (note that this is the same formula as the Battese formula for each farm) can be computed as:
sow_raster <-raster(sow_ras)sow_raster <-mask(sow_raster, India_aoi_sf_dis_sp)plot(sow_raster, main ="Spatially differentiated effect of early sowing on mu")
The contributions of the different practices to reducing the yield gaps can be expressed in the units of yields, i.e., tons/ha. With this we can then compare which of the practices deliver more yield gap reduction thereby prioritizing which of the practices should be targeted where. We use the marginal efficiency formula that has been separately derived by Liu and Myers (2008), Olsen and Henningsen (2011) and Pross et al (2018).
5 Impact of z on TE
5.1 Non-spatial model
Code
## Effect of sowing date and weed management practices on inefficiencyte_score_sowing <- frontier::efficiencies(sfa_f, margEff =TRUE)Z_sowing <-attributes(te_score_sowing)Z_sowing_dt <-as.data.frame(Z_sowing$margEff)ggplot(data = Z_sowing_dt) +geom_histogram(aes(x = efficiency.Z_Sowing_Date_Early), bins =20) +# scale_x_continuous(breaks = scales::pretty_breaks(n = 5), expand = c(0, 0)) +labs(title ="Z sowing marginal efficiency", x ="Marginal efficiency") +theme_bw()
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.004474 0.025856 0.040612 0.037840 0.051216 0.055711
Early sowing of wheat increases efficiency by 5.6 percentage points while weeding improves efficiency by about 3.7 percentage points.
Translating this in the context of yield gaps it means that early sowing reduces efficiency yield gap by the same percentage points say from 50 percent to 45 percent. In terms of the quantities, one can also take the same percentage point contribution.
We also note that the partial effect on efficiency is the same as the partial effect on yield. That means that the percentage point increase in efficiency due to early sowing or other management variable will also increase yields by the same amount. This applies in the linear models (i.e., where the variable enters the inefficiency model linearly) but in the case of the non-linear models, one needs a variant of the formula to account for the fact that the partial effect may vary between the efficiency and the yields.
Min. 1st Qu. Median Mean 3rd Qu. Max.
-1.588755 -0.071266 -0.031679 -0.056670 -0.007815 2.606674
Code
summary(tau_hat$sowing_effect_on_TE_final)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-1.588755 -0.071266 -0.031679 -0.056670 -0.007815 2.606674
Code
# Mapping the results names(pred)[1:2] <-c("lon", "lat")effdt_battese2 <-cbind(lon, lat, tau_hat)effdt_battese2_sp <-SpatialPointsDataFrame(cbind(effdt_battese2$lon, effdt_battese2$lat), data = effdt_battese2, proj4string =CRS("+proj=longlat +datum=WGS84"))library(tmap)tmap_mode("view")tm_shape(effdt_battese2_sp) +tm_dots(col ="sowing_effect_on_TE_final", title ="Impact of early sowing on inefficiency", style ="quantile") +tm_layout(legend.outside =TRUE)
Code
# Mapping the estimates: NOT WORKING PROPERLYf_y_hat_TE <- sowing_effect_on_TE_final ~s(lon, lat)## estimate model.b_y_hat_TE <-bamlss(f_y_hat_TE, data = effdt_battese2)
library(raster)myras2_mu_y_TE <-raster(myras_mu_y_TE)library(sf)India_aoi_sf_dis_sp <-as_Spatial(India_aoi_sf_dis)myras2_mu_y_TE <-mask(myras2_mu_y_TE, India_aoi_sf_dis_sp)plot(-1*myras2_mu_y_TE, main ="Impact of early sowing on efficiency")
6 Impact of Z variables on optimal input use [slack resources]
To understand by how much variables that improve efficiency can allow a farmer to produce the same amount of output but using less resources, we estimate the marginal effect of these variables on slack resources: that is the difference between optimal use of resources and the current levels of use of resources.
We focus on application on reducing N overuse.
Following Pross et al, the marginal effect of Z on the slack resources is: \[
\Delta X_k=X_k^* -X_k
\]
# estimate efficiency yield gap (%)data_new["efficiency_yg"] <-100- (data_new["te_score_cd"] *100)# select relevant columnsdata_new <- data_new[c("zone_new", "season_year", "hhid", "plotid", "subplotid", "te_score_cd","efficiency_yg")]# merge the new columns to original data framedata <-merge(data, data_new, by =c("zone_new", "season_year", "hhid", "subplotid"), all.x = T)# estimate technical efficiency yield (t/ha)data["ytex_tha"] <- data["yield_tha"] / data["te_score_cd"]# create an empty data framedata_final <-data.frame()# create loop per yearfor (yr inunique(data$year)) { subset_year <-subset(data, year == yr)# create loop per climate zonefor (cz inunique(subset_year$gyga_cz)) { subset_cz <-subset(subset_year, gyga_cz == cz)# create loop per soil typefor (soil inunique(subset_cz$soil_fertility)) { subset_soil <-subset(subset_cz, soil_fertility == soil)# create column with field class based on yield distribution subset_soil$field_class <-ifelse(subset_soil$yield_tha >=quantile(subset_soil$yield_tha, 0.90),"YHF", "" ) subset_soil$field_class <-ifelse(subset_soil$yield_tha <=quantile(subset_soil$yield_tha, 0.10),"YLF", subset_soil$field_class ) subset_soil$field_class <-ifelse(subset_soil$yield_tha >quantile(subset_soil$yield_tha, 0.10) & subset_soil$yield_tha <quantile(subset_soil$yield_tha, 0.90),"YAF", subset_soil$field_class )# subset highest yielding fields only yhf <-subset(subset_soil, field_class =="YHF")# add column with yhf in t/ha to data frame subset_soil["yhf_tha"] <-mean(yhf$yield_tha, na.rm = T)# bind all individual fields into single data frame data_final <-rbind(data_final, subset_soil) } }}#
10.1 geoadditive efficiency model
10.1.1 Collect geocoodinates or district shapefiles z
Code
# Write formulae for parameters for the mean model, sigma model, and inefficiency model# We use the original functional forms: identity for mean model, log sigma for sigma model and log ineff for the inefficiency# mu_formula <- yield_tha ~# season_year + gyga_gdd + gyga_tseas + seed_kgha + variety +# gyga_ai + gyga_av_water + soil_depth + soil_fertility + waterlogging_yn + drought_yn + soilwatercons_yn +# nfert_kgha + manure_yn + residues_yn + previous_crop + oxplough_freq_cat +# herb_lha + handweeding_persdayha + weeding_yn + pesticide_yn + disease_incidence_yn + pest_incidence_yn# sigma_v_formula <- ~ 1 # sigma_u_formula <- ~ 1 + weeding_yn#s <- -1 # production function specification# Fit model# If using older versions of R you can use the following code# model <- mgcv::gam(# formula = list(mu_formula, sigma_v_formula, sigma_u_formula),# data = LDS_wheat_public_cleaned, family = normhnorm(s = s), optimizer = c("efs")# )# Eth_model_dsfa<-dsfa(formula=list(mu_formula, sigma_v_formula, sigma_u_formula),# data=data_final, family=comper(s=s, distr="normhnorm"), optimizer = c("efs"))# summary(Eth_model_dsfa)# plot(Eth_model_dsfa, select = 1) # Estimated function# plot(Eth_model_dsfa, select = 2)
11 Discussion: Yield gap analysis as a prioritization framework
We turn next to discuss how yield gap analysis using stochastic frontier analysis can be used as a prioritization framework. Similar ideas can be extended to account for bad outputs like GHG and multiple outputs.
12 Conclusion
13 References
Liu, Y., and Myers, R. 2009. “Model selection in stochastic frontier analysis with an application to maize production in Kenya”. Journal of Productivity Analysis 31: 33-46. Doi: https://doi.org/10.1007/s11123-008-0111-9.
Olsen, J.V., and Henningsen, A.2011. “Investment Utilisation, Adjustment Costs, and Technical Efficiency in Danish Pig Farms.” FOI Working Paper, No. 2011/13, University of Copenhagen, Department of Food and Resource Economics (IFRO),Copenhagen.
Pross, C., Strumann, C., Geissler, A., Herwatz, H., and Klein, N. 2018. “Quality and resource efficiency in hospital service provision: A geoadditive stochastic frontier analysis of stroke quality of care in Germany”. PLOS One. Doi:https://doi.org/10.1371/journal.pone.0203017.
Source Code
---title: "Spatially explicit yield gap decomposition by management and policy variables using Bayesian geoadditive distributional efficiency approach"format: html: code-fold: true code-tools: truefig-dpi: 300fig-width: 8.88fig-align: centerfig-height: 5self-contained: trueeditor: visualtoc: truetoc-location: leftnumber-sections: trueexecute: message: false warning: false echo: true---# IntroductionWe follow Silva (###) approach of calculating and decomposing yield gaps in agriculture. In this approach, yield gaps are decomposed into technology yield gaps, resource yield gaps and efficiency yield gaps.We address two challenges and potential advancements that have been identified in the literature on this approach, (i) granular decomposition of yield gaps beyond technology, resource, and efficiency yield gaps to actually consider the contribution of management and policy relevant variables like early sowing, and weed management by decomposing the efficiency yield gaps, and contribution to reducing inputs through decomposition of the resource gaps, and (2) spatially granular decomposition of yield gaps by demonstrating in which location or grid is each of the management practices more relevant to close the yield gaps.In this workbook we will a recent approach of estimating marginal impact on efficiency for each of the efficiency variables at spatially disaggregated level and demonstrating that this approach can allow one to explore the sources of inefficiency at a spatially disaggregated level.This then allow one to decompose the efficiency yield gap into policy and action oriented components for which farmers and other stakeholders can attempt to make adjustments. We also follow Pross et al () in identify resource yield gaps and identifying which input usage can be reallocated. We do these decompositions at a spatially disaggregated level thereby allowing one to identify if policies related to say subsidies or input assistance can help in increasing yields or whether other management mechanisms (for example split application) can be more helpful in a particular location.On understanding which aspects of the efficieny yield gaps should be prioritized where, we decompose these into genotype, environment, management and socioeconomics (GEMS).# Conventional Yield Gap Decomposition (following on Silva et al)```{r}# package namespackages <-c("ggplot2", "micEcon", "frontier", "dplyr", "tidyr", "knitr", "car", "RColorBrewer", "DT", "rio", "tidyr", "dsfa", "mgcv", "geodata", "sf", "mapview", "dplyr", "terra", "raster")# install packagesinstalled_packages <- packages %in%rownames(installed.packages())if (any(installed_packages ==FALSE)) {install.packages(packages[!installed_packages])}# load packagesinvisible(lapply(packages, library, character.only =TRUE))# install.packages("collapse", repos = "https://fastverse.r-universe.dev")LDS_wheat_public_cleaned <-import("LDS_wheat_public_cleaned.csv")# Using sfa function in frontiersfa_f <-sfa(log(L.tonPerHectare) ~ Nperha + wc2.1_30s_elev + temp + precip | Sowing_Date_Early + Weedmanaged, data = LDS_wheat_public_cleaned)# see parameter estimatessummary(sfa_f)# Efficiency scoreseff <-efficiencies(sfa_f, type ="battese")eff <-as.data.frame(eff)ggplot(data = eff) +geom_histogram(aes(x = efficiency), bins =20) +# scale_x_continuous(breaks = scales::pretty_breaks(n = 5), expand = c(0, 0)) +labs(title ="Battese Efficiency Score", x ="Efficiency score") +theme_bw()``````{r}# Yield gapeff_yield_gap_perc <-100- (eff$efficiency *100)eff_yield_gap_perc <-as.data.frame(eff_yield_gap_perc)ggplot(data = eff_yield_gap_perc) +geom_histogram(aes(x = eff_yield_gap_perc), bins =20) +# scale_x_continuous(breaks = scales::pretty_breaks(n = 5), expand = c(0, 0)) +labs(title ="Efficiency yield gap", x ="Efficiency yield gap") +theme_bw()eff_d <- sfa_f$dataTableeff_d <-as.data.frame(eff_d)eff_d$ln_yield <- eff_d$`log(L.tonPerHectare)`eff_d$yield_t_ha <-exp(eff_d$ln_yield)# Efficiency yield: maximum yieldeff_d$max_eff_yield <- eff_d$yield_t_ha / eff$efficiencyggplot(data = eff_d) +geom_histogram(aes(x = max_eff_yield), bins =20) +labs(title ="Maximum efficiency yield", x ="Maximum efficiency yield") +theme_bw()# Yield gap in tons per haeff_d$yield_gap_t_ha <- eff_d$max_eff_yield - eff_d$yield_t_haggplot(data = eff_d) +geom_histogram(aes(x = yield_gap_t_ha), bins =20) +labs(title ="Efficiency yield gap(t/ha)", x ="Efficiency yield gap(t/ha)") +theme_bw()# Yield gap reduction due to early sowing in tons/haeff_d$early_sowing_yield_gap_redn <- eff_d$yield_gap_t_ha * eff$efficiencyggplot(data = eff_d) +geom_histogram(aes(x = early_sowing_yield_gap_redn), bins =20) +labs(title ="Early sowing effect (t/ha)", x ="Early sowing effect (t/ha)") +theme_bw()# Input savings due to early sowing in tons per ha```# Geoadditive stochastic frontier analysis## Estimation using thin plate spline```{r}# LDSestim_gam=LDSestim %>% drop_na()# Write formulae for parameters for the mean model, sigma model, and inefficiency model# We use the original functional forms: identity for mean model, log sigma for sigma model and log ineff for the inefficiencymu_formula <-log(L.tonPerHectare)~Nperha + wc2.1_30s_elev +s(O.largestPlotGPS.Longitude, O.largestPlotGPS.Latitude,by = Weedmanaged)sigma_v_formula <-~1+ temp + precipsigma_u_formula <-~1+s(O.largestPlotGPS.Longitude, O.largestPlotGPS.Latitude, by = Sowing_Date_Early) +s(O.largestPlotGPS.Longitude, O.largestPlotGPS.Latitude,by = Weedmanaged)s <--1# production function specification# Fit model# If using older versions of R you can use the following code# model <- mgcv::gam(# formula = list(mu_formula, sigma_v_formula, sigma_u_formula),# data = LDS_wheat_public_cleaned, family = normhnorm(s = s), optimizer = c("efs")# )model_dsfa<-dsfa(formula=list(mu_formula, sigma_v_formula, sigma_u_formula),data=LDS_wheat_public_cleaned, family=comper(s=s, distr="normhnorm"), optimizer =c("efs"))summary(model_dsfa)plot(model_dsfa, select =1) # Estimated functionplot(model_dsfa, select =2) ```## Efficiency calculations```{r}# fitted valuestau_hat=model_dsfa$fitted.valuestau_hat=as.data.frame(tau_hat)names(tau_hat)[1:3]=c("mu","sigma_v","sigma_u")#Manual calculation of efficiency: Battesetau_hat$sigma_c<-sqrt((tau_hat$sigma_u^2*tau_hat$sigma_v^2)/(tau_hat$sigma_v^2+tau_hat$sigma_u^2))#(1/sigma_u^2+1/sigma_v^2)^(-1)model_dsfa_data=model_dsfa$modeltau_hat$mu_c<--(model_dsfa_data$`log(L.tonPerHectare)`-tau_hat$mu)/tau_hat$sigma_v^2*tau_hat$sigma_c^2#object$family$s*mgcv::residuals.gam(object)/sigma_v^2*sigma_c^2tau_hat$u<-exp(1/2*tau_hat$sigma_c^2-tau_hat$mu_c)*stats::pnorm(tau_hat$mu_c/tau_hat$sigma_c-tau_hat$sigma_c)/stats::pnorm(tau_hat$mu_c/tau_hat$sigma_c)# Package based calculationtau_hat$eff <-efficiency(model_dsfa, type ="battese")# Jondrow efficiency measuretau_hat$sigma_c_j<-sqrt((tau_hat$sigma_u^2*tau_hat$sigma_v^2)/(tau_hat$sigma_v^2+tau_hat$sigma_u^2))#(1/sigma_u^2+1/sigma_v^2)^(-1)tau_hat$mu_c_j<--(model_dsfa_data$`log(L.tonPerHectare)`-tau_hat$mu)/tau_hat$sigma_v^2*tau_hat$sigma_c_j^2#object$family$s*mgcv::residuals.gam(object)/sigma_v^2*sigma_c^2tau_hat$u_jondrow<-tau_hat$mu_c_j+tau_hat$sigma_c_j*stats::dnorm(tau_hat$mu_c_j/tau_hat$sigma_c_j)/stats::pnorm(tau_hat$mu_c_j/tau_hat$sigma_c_j)tau_hat$eff_j <-efficiency(model_dsfa, type ="jondrow")# CI_lower<-mu_c+stats::qnorm(lower)*sigma_c# CI_upper<-mu_c+stats::qnorm(upper)*sigma_c# CI_lower<-exp(-CI_upper)# CI_upper<-exp(-CI_lower)```## Geoadditive mapping of the efficiency scores, and efficiency yield gaps```{r}lon <- model_dsfa_data$O.largestPlotGPS.Longitudelon <-as.data.frame(lon)lat <- model_dsfa_data$O.largestPlotGPS.Latitudelat <-as.data.frame(lat)effdt_battese <-cbind(lon, lat, model_dsfa_data, tau_hat)library(bamlss)set.seed(111)f <- u ~s(lon, lat)## estimate model.b <-bamlss(f, data = effdt_battese)# Boundary mapIndia <-gadm(country ="IND", level =2, path ="shp")plot(India)India_aoi <-subset(India, India$NAME_1 =="Bihar"| India$NAME_2 %in%c("Ballia", "Chandauli", "Deoria", "Ghazipur", "Kushinagar", "Maharajganj", "Mau", "Siddharth Nagar", "Gorakhpur"))plot(India_aoi)India_aoi_sf <-st_as_sf(India_aoi)mapview(India_aoi_sf)India_aoi_sf_dis <-st_union(India_aoi_sf)# Predictelevationglobal_geodata <-elevation_global(0.5, path ="raster")elevationglobal_geodata_aoi <- terra::crop(elevationglobal_geodata, India_aoi_sf_dis)library(raster)elevationglobal_geodata_aoi <-raster(elevationglobal_geodata_aoi)plot(elevationglobal_geodata_aoi)pred <-SpatialPoints(elevationglobal_geodata_aoi)@coordspred <-as.data.frame(pred)names(pred)[1:2] <-c("lon", "lat")# pred <- expand.grid(lon = seq(82, 89, length = 100),lat = seq(24,28, length = 100))effdt_battese_hat <-predict(b, newdata = pred)effdt_battese_hat <-as.data.frame(effdt_battese_hat)pred_effdt_battese_hat <-cbind(pred, effdt_battese_hat)pred_effdt_battese_hat$sigma <-NULLlibrary(terra)myras <-rast(pred_effdt_battese_hat, type ="xyz")plot(myras)library(raster)myras2 <-raster(myras)library(sf)India_aoi_sf_dis_sp <-as_Spatial(India_aoi_sf_dis)myras2 <-mask(myras2, India_aoi_sf_dis_sp)plot(myras2, main ="Gridded efficiency scores")# Estimate the efficiency yield gap (%)effic_yield_gap_perc <-100- myras2 *100plot(effic_yield_gap_perc, main ="Efficiency yield gap(%)")# Efficiency yield## Krige yields over the area of intersect## use the predicted yields and efficiency scorelibrary(bamlss)set.seed(111)effdt_battese$L.tonPerHectare <-exp(effdt_battese$`log(L.tonPerHectare)`)f_y <- L.tonPerHectare ~s(lon, lat)## estimate model.b_y <-bamlss(f_y, data = effdt_battese)effdt_battese_hat_y <-predict(b_y, newdata = pred)effdt_battese_hat_y <-as.data.frame(effdt_battese_hat_y)pred_effdt_battese_hat_y <-cbind(pred, effdt_battese_hat_y)pred_effdt_battese_hat_y$sigma <-NULLlibrary(terra)myras_y <-rast(pred_effdt_battese_hat_y, type ="xyz")plot(myras_y)library(raster)myras2_y <-raster(myras_y)library(sf)India_aoi_sf_dis_sp <-as_Spatial(India_aoi_sf_dis)myras2_y <-mask(myras2_y, India_aoi_sf_dis_sp)plot(myras2_y, main ="Gridded wheat yields (t/ha)")# Maximum efficient yieldmyras_max_eff_yield <- myras2_y / myras2plot(myras_max_eff_yield, main ="Maximum efficient yield(t/ha)")# Predicted yields from the modellibrary(bamlss)set.seed(111)effdt_battese$exp_mu <-exp(effdt_battese$mu)f_y_hat <- exp_mu ~s(lon, lat)## estimate model.b_y_hat <-bamlss(f_y_hat, data = effdt_battese)effdt_battese_hat_mu_y <-predict(b_y_hat, newdata = pred)effdt_battese_hat_mu_y <-as.data.frame(effdt_battese_hat_mu_y)pred_effdt_battese_hat_mu_y <-cbind(pred, effdt_battese_hat_mu_y)pred_effdt_battese_hat_mu_y$sigma <-NULLlibrary(terra)myras_mu_y <-rast(pred_effdt_battese_hat_mu_y, type ="xyz")plot(myras_mu_y)library(raster)myras2_mu_y <-raster(myras_mu_y)library(sf)India_aoi_sf_dis_sp <-as_Spatial(India_aoi_sf_dis)myras2_mu_y <-mask(myras2_mu_y, India_aoi_sf_dis_sp)plot(myras2_mu_y, main ="Gridded predicted wheat yields (t/ha)")```# Geospatial mapping of the efficiency parametersOur goal here to show the spatial variation in the effect of a variable on the efficiency scoresUsing formula from Pross et al (2018), technical efficiency for each grid (note that this is the same formula as the Battese formula for each farm) can be computed as:$$ TE=E[exp(-u|\epsilon)]=\frac{exp(-u+0.5\sigma^2) \Phi (u/\sigma -\sigma)}{\Phi (u/\sigma)} $$$$TE=a\times b \times c$$ Where $$a=exp(-u+0.5\sigma^2)$$$$b= \Phi (u/\sigma -\sigma)$$$$c=[\Phi (u/\sigma)]^{-1}$$$$u=\frac{-\epsilon \sigma_u^2}{\sigma_u^2-\sigma_v^2}$$$$\sigma^2=\frac{\sigma_u^2 \sigma_v^2}{\sigma_u^2-\sigma_v^2}$$$$\sigma_u^2=(\sigma_u)^2 \alpha= (\sigma_u*)^2 exp(\eta^u)= (\sigma_u*)^2 exp(Z'\beta^u)$$$$\epsilon=y-\eta^y$$The marginal effect of the explanatory variable on TE is given by:$$ME=\frac{\partial TE}{\partial z}$$$$ME=a\times (0.5 \times d-e)\times b \times c+ a\times \phi (u/\sigma -\sigma) (g-f)\times c - a \times b \times c^2 \times \phi \times (u/sigma) \times g$$$$d=\beta_k^u \times (\frac{\sigma^2}{\sigma_u^2})$$$$e=-\epsilon \times \beta_k^u (\frac{\sigma_u^2 \sigma_v^2}{(\sigma_u^2 + \sigma_v^2})^2)$$$$f=0.5 \times \sigma_u \sigma_v \beta^u ((\sigma_u^2 +\sigma_v^2)^{-0.5}-(\sigma_u^2 +\sigma_v^2)^{-3/2} \sigma_u^2)$$$$g=\frac{\sigma^2 e -uf}{\sigma^2}$$```{r}names(pred)[1:2] <-c("O.largestPlotGPS.Longitude", "O.largestPlotGPS.Latitude")pred$Sowing_Date_Early <-1pred$Nperha <-mean(model_dsfa_data$Nperha)pred$Weedmanaged <-mean(model_dsfa_data$Weedmanaged)pred$temp <-mean(model_dsfa_data$temp)pred$precip <-mean(model_dsfa_data$precip)pred$wc2.1_30s_elev <-mean(model_dsfa_data$wc2.1_30s_elev)sowingdate_effect_on_inefficiency <-predict.gam(model_dsfa, newdata = pred, model ="sigma_u", term ="s.2(O.largestPlotGPS.Longitude,O.largestPlotGPS.Latitude):Sowing_Date_Early")sowingdate_effect_on_inefficiency <-as.data.frame(sowingdate_effect_on_inefficiency)sowingdate_effect_on_inefficiency <-cbind(pred[, c("O.largestPlotGPS.Longitude", "O.largestPlotGPS.Latitude")], sowingdate_effect_on_inefficiency)sowingdate_effect_on_inefficiency$V1 <-NULLsowingdate_effect_on_inefficiency$V2 <-NULLsowingdate_effect_on_inefficiency <-rename(sowingdate_effect_on_inefficiency, Early_Sowing = V3)sow_ras <-rast(sowingdate_effect_on_inefficiency, type ="xyz")plot(sow_ras)sow_raster <-raster(sow_ras)sow_raster <-mask(sow_raster, India_aoi_sf_dis_sp)plot(sow_raster, main ="Spatially differentiated effect of early sowing on mu")```The contributions of the different practices to reducing the yield gaps can be expressed in the units of yields, i.e., tons/ha. With this we can then compare which of the practices deliver more yield gap reduction thereby prioritizing which of the practices should be targeted where. We use the marginal efficiency formula that has been separately derived by Liu and Myers (2008), Olsen and Henningsen (2011) and Pross et al (2018).# Impact of z on TE## Non-spatial model```{r}## Effect of sowing date and weed management practices on inefficiencyte_score_sowing <- frontier::efficiencies(sfa_f, margEff =TRUE)Z_sowing <-attributes(te_score_sowing)Z_sowing_dt <-as.data.frame(Z_sowing$margEff)ggplot(data = Z_sowing_dt) +geom_histogram(aes(x = efficiency.Z_Sowing_Date_Early), bins =20) +# scale_x_continuous(breaks = scales::pretty_breaks(n = 5), expand = c(0, 0)) +labs(title ="Z sowing marginal efficiency", x ="Marginal efficiency") +theme_bw()summary(Z_sowing_dt$efficiency.Z_Sowing_Date_Early)## Weed managementggplot(data = Z_sowing_dt) +geom_histogram(aes(x = efficiency.Z_Weedmanaged), , bins =20) +# scale_x_continuous(breaks = scales::pretty_breaks(n = 5), expand = c(0, 0)) +labs(title ="Z weeding marginal efficiency", x ="Marginal efficiency") +theme_bw()summary(Z_sowing_dt$efficiency.Z_Weedmanaged)```Early sowing of wheat increases efficiency by 5.6 percentage points while weeding improves efficiency by about 3.7 percentage points.Translating this in the context of yield gaps it means that early sowing reduces efficiency yield gap by the same percentage points say from 50 percent to 45 percent. In terms of the quantities, one can also take the same percentage point contribution.We also note that the partial effect on efficiency is the same as the partial effect on yield. That means that the percentage point increase in efficiency due to early sowing or other management variable will also increase yields by the same amount. This applies in the linear models (i.e., where the variable enters the inefficiency model linearly) but in the case of the non-linear models, one needs a variant of the formula to account for the fact that the partial effect may vary between the efficiency and the yields.## Spatial model```{r}tau_hat$sigma_c <-sqrt((tau_hat$sigma_u^2* tau_hat$sigma_v^2) / (tau_hat$sigma_v^2+ tau_hat$sigma_u^2)) # (1/sigma_u^2+1/sigma_v^2)^(-1)model_dsfa_data <- model_dsfa$modeltau_hat$mu_c <--(model_dsfa_data$`log(L.tonPerHectare)`- tau_hat$mu) / tau_hat$sigma_v^2* tau_hat$sigma_c^2# object$family$s*mgcv::residuals.gam(object)/sigma_v^2*sigma_c^2# tau_hat$epsilon_gam <- model_dsfa$family$s * mgcv::residuals.gam(model_dsfa)# tau_hat$mu_cc <- -(tau_hat$epsilon_gam) / tau_hat$sigma_v^2 * tau_hat$sigma_c^2 # object$family$s*mgcv::residuals.gam(object)/sigma_v^2*sigma_c^2# Efficency estimates based on the formulatau_hat$u <-exp(1/2* tau_hat$sigma_c^2- tau_hat$mu_c) * stats::pnorm(tau_hat$mu_c / tau_hat$sigma_c - tau_hat$sigma_c) / stats::pnorm(tau_hat$mu_c / tau_hat$sigma_c) # Impact of Z variables on efficiency yield gap [1-TE]# Pross formula simplificationtau_hat$a <-exp(1/2* tau_hat$sigma_c^2- tau_hat$mu_c)tau_hat$b <- stats::pnorm(tau_hat$mu_c / tau_hat$sigma_c - tau_hat$sigma_c)tau_hat$c <-1/ (stats::pnorm(tau_hat$mu_c / tau_hat$sigma_c))tau_hat$u_pross_formula <- tau_hat$a * tau_hat$b * tau_hat$csummary(tau_hat$u_pross_formula)# Predict the effect on inefficiency term (eta or mu)sowing_trted_dt <- model_dsfa$modelsowing_trted_dt$Sowing_Date_Early <-1sowingdate_effect_on_inefficiency_obs <-predict.gam(model_dsfa, newdata = sowing_trted_dt, model ="sigma_u", term ="s.2(O.largestPlotGPS.Longitude,O.largestPlotGPS.Latitude):Sowing_Date_Early")sowingdate_effect_on_inefficiency_obs <-as.data.frame(sowingdate_effect_on_inefficiency_obs)sowingdate_effect_on_inefficiency_obs$V1 <-NULLsowingdate_effect_on_inefficiency_obs$V2 <-NULLsowingdate_effect_on_inefficiency_obs <-rename(sowingdate_effect_on_inefficiency_obs, sowing_date_early = V3)# ME of Ztau_hat$d <- sowingdate_effect_on_inefficiency_obs$sowing_date_early * (tau_hat$sigma_c^2/ tau_hat$sigma_u^2)tau_hat$e <--(model_dsfa_data$`log(L.tonPerHectare)`- tau_hat$mu) * sowingdate_effect_on_inefficiency_obs$sowing_date_early * (tau_hat$sigma_v^2* tau_hat$sigma_u^2) / ((tau_hat$sigma_v^2+ tau_hat$sigma_u^2)^2)tau_hat$f <-0.5* tau_hat$sigma_u * tau_hat$sigma_v * sowingdate_effect_on_inefficiency_obs$sowing_date_early * (((tau_hat$sigma_u^2+ tau_hat$sigma_v^2)^(-0.5)) - (((tau_hat$sigma_u + tau_hat$sigma_v)^(-1.5)) * (tau_hat$sigma_u^2)))tau_hat$g <- ((tau_hat$sigma_c * tau_hat$e) - (tau_hat$mu_c * tau_hat$f)) / ((tau_hat$sigma_c)^2)tau_hat$sowing_effect_on_TE1 <- tau_hat$a * (0.5* tau_hat$d - tau_hat$e) * tau_hat$b * tau_hat$ctau_hat$density_mu_c_sig_c <- stats::dnorm((tau_hat$mu_c / tau_hat$sigma_c) - tau_hat$sigma_c)tau_hat$sowing_effect_on_TE2 <- tau_hat$a * tau_hat$density_mu_c_sig_c * (tau_hat$g - tau_hat$f) * tau_hat$sigma_ctau_hat$density_mu_c_sig_c2 <- stats::dnorm((tau_hat$mu_c / tau_hat$sigma_c))tau_hat$sowing_effect_on_TE3 <- tau_hat$a * tau_hat$b * ((tau_hat$c)^2) * tau_hat$density_mu_c_sig_c2 * tau_hat$gtau_hat$sowing_effect_on_TE_final <- tau_hat$sowing_effect_on_TE1 + tau_hat$sowing_effect_on_TE2 - tau_hat$sowing_effect_on_TE3tau_hat$sowing_effect_on_TE_final <-as.numeric(tau_hat$sowing_effect_on_TE_final)summary(tau_hat$sowing_effect_on_TE_final)summary(tau_hat$sowing_effect_on_TE_final)# Mapping the results names(pred)[1:2] <-c("lon", "lat")effdt_battese2 <-cbind(lon, lat, tau_hat)effdt_battese2_sp <-SpatialPointsDataFrame(cbind(effdt_battese2$lon, effdt_battese2$lat), data = effdt_battese2, proj4string =CRS("+proj=longlat +datum=WGS84"))library(tmap)tmap_mode("view")tm_shape(effdt_battese2_sp) +tm_dots(col ="sowing_effect_on_TE_final", title ="Impact of early sowing on inefficiency", style ="quantile") +tm_layout(legend.outside =TRUE)# Mapping the estimates: NOT WORKING PROPERLYf_y_hat_TE <- sowing_effect_on_TE_final ~s(lon, lat)## estimate model.b_y_hat_TE <-bamlss(f_y_hat_TE, data = effdt_battese2)effdt_battese_hat_mu_y_TE <-predict(b_y_hat_TE, newdata = pred)effdt_battese_hat_mu_y_TE <-as.data.frame(effdt_battese_hat_mu_y_TE)pred_effdt_battese_hat_mu_y_TE <-cbind(pred, effdt_battese_hat_mu_y_TE)pred_effdt_battese_hat_mu_y_TE$sigma <-NULLpred_effdt_battese_hat_mu_y_TE$Sowing_Date_Early <-NULLpred_effdt_battese_hat_mu_y_TE$Nperha <-NULLpred_effdt_battese_hat_mu_y_TE$Weedmanaged <-NULLpred_effdt_battese_hat_mu_y_TE$temp <-NULLpred_effdt_battese_hat_mu_y_TE$precip <-NULLpred_effdt_battese_hat_mu_y_TE$wc2.1_30s_elev <-NULLlibrary(terra)myras_mu_y_TE <-rast(pred_effdt_battese_hat_mu_y_TE, type ="xyz")plot(myras_mu_y_TE)library(raster)myras2_mu_y_TE <-raster(myras_mu_y_TE)library(sf)India_aoi_sf_dis_sp <-as_Spatial(India_aoi_sf_dis)myras2_mu_y_TE <-mask(myras2_mu_y_TE, India_aoi_sf_dis_sp)plot(-1*myras2_mu_y_TE, main ="Impact of early sowing on efficiency")```# Impact of Z variables on optimal input use \[slack resources\]To understand by how much variables that improve efficiency can allow a farmer to produce the same amount of output but using less resources, we estimate the marginal effect of these variables on slack resources: that is the difference between optimal use of resources and the current levels of use of resources.We focus on application on reducing N overuse.Following Pross et al, the marginal effect of Z on the slack resources is: $$\Delta X_k=X_k^* -X_k$$$$\Delta X_k = X_k (\frac{1}{(1+ME/TE)^{1/\beta_k}}-1)$$# How much input can be reduced to achieve same level of output due to increase in efficiency# Non spatial model```{r}coeffs <-coefficients(sfa_f)coeffs <-as.data.frame(coeffs)slack_sow_N_effect_non_sp <- eff_d$Nperha*(1/(1+(Z_sowing_dt$efficiency.Z_Sowing_Date_Early/eff$efficiency)^(1/0.0018))-1)summary(slack_sow_N_effect_non_sp)slack_weed_N_effect_non_sp <- eff_d$Nperha*(1/(1+(Z_sowing_dt$efficiency.Z_Weedmanaged/eff$efficiency)^(1/0.0018))-1)summary(slack_weed_N_effect_non_sp)```# Spatial model```{r}slack_sow_N_effect_sp <- model_dsfa_data$Nperha*(1/(1+(-1*tau_hat$sowing_effect_on_TE_final/tau_hat$eff)^(1/0.001315))-1)summary(slack_sow_N_effect_sp)```# Ethiopia case studyWe copy code from Silva () repository to showcase how the new method can be implement```{r}# package namespackages <-c("frontier", "dplyr", "tidyr", "knitr", "car", "RColorBrewer", "DT")# install packagesinstalled_packages <- packages %in%rownames(installed.packages())if (any(installed_packages ==FALSE)) {install.packages(packages[!installed_packages])}# load packagesinvisible(lapply(packages, library, character.only =TRUE))# read .csv file with datafile <-"https://raw.githubusercontent.com/jvasco323/EiA_YGD_workflow/main/data-wheat-ethiopia.csv"data <-read.csv(url(file))# list variables of intereststr(data)# create final datadata <-subset(data, yield_tha >0)data <-subset(data, residues_yn =="No"| residues_yn =="Yes")data <-subset(data, soil_slope =="Flat"| soil_slope =="Medium"| soil_slope =="Steep")data <-subset(data, zone_new !="")data <-subset(data, oxplough_freq_cat =="<Two"| oxplough_freq_cat =="Three"| oxplough_freq_cat =="Four"| oxplough_freq_cat ==">Five")data <-subset(data, weeding_freq_cat =="None"| weeding_freq_cat =="One"| weeding_freq_cat =="Two"| weeding_freq_cat =="Three+")# fill NA valuesdata$seed_kgha[is.na(data$seed_kgha)] <-mean(data$seed_kgha, na.rm = T)data$nfert_kgha[is.na(data$nfert_kgha)] <-0data$herb_lha[is.na(data$herb_lha)] <-0data$handweeding_persdayha[is.na(data$handweeding_persdayha)] <-0# reclassify categorical variablesdata$variety <-ifelse(data$variety !="Landrace"& data$variety !="unknown", "Improved", data$variety)data$variety <-ifelse(data$variety =="Landrace", "unknown", data$variety)data$nfert_yn <-ifelse(data$nfert_kgha ==0, "N0", "N+")data$weeding_yn <-ifelse(data$herb_lha ==0& data$handweeding_persdayha ==0, "No", "Yes")# copy df with transformed datadata_new <- data# replace 0 with small value for log-transformationdata_new[data_new ==0] <-0.0001# log-transform continuous variablesvars1 <-c("gyga_gdd", "gyga_tseas", "seed_kgha", "gyga_ai", "gyga_av_water", "nfert_kgha", "pfert_kgha","herb_lha", "handweeding_persdayha", "yield_tha")log_f <-function(x) {log(x)}data_new[, vars1] <-lapply(data_new[, vars1], log_f)# set categorical variables to factorvars2 <-c("farming_system", "aez", "zone_new", "season_year", "variety", "soil_depth", "soil_fertility","waterlogging_yn", "drought_yn", "soilwatercons_yn", "manure_yn", "residues_yn", "previous_crop","oxplough_freq_cat", "weeding_yn", "pesticide_yn", "disease_incidence_yn", "pest_incidence_yn")data_new[, vars2] <-lapply(data_new[, vars2], factor)vars3 <-c("year", "handweeding_persdayha", "herb_lha", "pfert_kgha", "nfert_kgha", "seed_kgha", "subplotsize_ha", "yield_tha")# meanlibrary(dplyr)numeric_cols_mean <- data[, vars3] %>%group_by(year) %>%summarise(across(.cols =where(is.numeric),.fns =list(Mean = mean), na.rm =TRUE,.names ="{col}" ))numeric_cols_mean <-round(numeric_cols_mean, 2)numeric_cols_mean <-t(numeric_cols_mean)colnames(numeric_cols_mean)[2] <-"Mean 2013"colnames(numeric_cols_mean)[1] <-"Mean 2009"numeric_cols_mean <- numeric_cols_mean[-1, ]Variable <-rownames(numeric_cols_mean)rownames(numeric_cols_mean) <-NULLnumeric_cols_mean <-cbind(Variable, numeric_cols_mean)# sdnumeric_cols_sd <- data[, vars3] %>%group_by(year) %>%summarise(across(.cols =where(is.numeric),.fns =list(SD = sd), na.rm =TRUE,.names ="{col}" ))numeric_cols_sd <-round(numeric_cols_sd, 2)numeric_cols_sd <-t(numeric_cols_sd)colnames(numeric_cols_sd)[2] <-"StDev 2013"colnames(numeric_cols_sd)[1] <-"StDev 2009"numeric_cols_sd <- numeric_cols_sd[-1, ]Variable <-rownames(numeric_cols_sd)rownames(numeric_cols_sd) <-NULLnumeric_cols_sd <-cbind(Variable, numeric_cols_sd)# mergenumeric_cols <-merge(numeric_cols_mean, numeric_cols_sd, by ="Variable")numeric_cols$Variable[1] <-"Hand-weeding (person-day/ha)"numeric_cols$Variable[2] <-"Herbicide use (L/ha)"numeric_cols$Variable[3] <-"N application rate (kg N/ha)"numeric_cols$Variable[4] <-"P application rate (kg P/ha)"numeric_cols$Variable[5] <-"Seed rate (kg/ha)"numeric_cols$Variable[6] <-"Plot size (ha)"numeric_cols$Variable[7] <-"Actual wheat yield (t/ha)"# showknitr::kable(numeric_cols)# fit ols regression modelols <-lm( yield_tha ~ season_year + gyga_gdd + gyga_tseas + seed_kgha + variety + gyga_ai + gyga_av_water + soil_depth + soil_fertility + waterlogging_yn + drought_yn + soilwatercons_yn + nfert_kgha + manure_yn + residues_yn + previous_crop + oxplough_freq_cat + herb_lha + handweeding_persdayha + weeding_yn + pesticide_yn + disease_incidence_yn + pest_incidence_yn,data = data_new )# check vif values# vif(ols)# see parameter estimates# summary(ols)# fit cobb-douglas stochastic frontiersfa_cd <-sfa( yield_tha ~ season_year + gyga_gdd + gyga_tseas + seed_kgha + variety + gyga_ai + gyga_av_water + soil_depth + soil_fertility + waterlogging_yn + drought_yn + soilwatercons_yn + nfert_kgha + manure_yn + residues_yn + previous_crop + oxplough_freq_cat + herb_lha + handweeding_persdayha + weeding_yn + pesticide_yn + disease_incidence_yn + pest_incidence_yn,data = data_new )# add technical efficiency score to data framedata_new$te_score_cd <-efficiencies(sfa_cd, asInData = T)# see parameter estimatessummary(sfa_cd)# estimate efficiency yield gap (%)data_new["efficiency_yg"] <-100- (data_new["te_score_cd"] *100)# select relevant columnsdata_new <- data_new[c("zone_new", "season_year", "hhid", "plotid", "subplotid", "te_score_cd","efficiency_yg")]# merge the new columns to original data framedata <-merge(data, data_new, by =c("zone_new", "season_year", "hhid", "subplotid"), all.x = T)# estimate technical efficiency yield (t/ha)data["ytex_tha"] <- data["yield_tha"] / data["te_score_cd"]# create an empty data framedata_final <-data.frame()# create loop per yearfor (yr inunique(data$year)) { subset_year <-subset(data, year == yr)# create loop per climate zonefor (cz inunique(subset_year$gyga_cz)) { subset_cz <-subset(subset_year, gyga_cz == cz)# create loop per soil typefor (soil inunique(subset_cz$soil_fertility)) { subset_soil <-subset(subset_cz, soil_fertility == soil)# create column with field class based on yield distribution subset_soil$field_class <-ifelse(subset_soil$yield_tha >=quantile(subset_soil$yield_tha, 0.90),"YHF", "" ) subset_soil$field_class <-ifelse(subset_soil$yield_tha <=quantile(subset_soil$yield_tha, 0.10),"YLF", subset_soil$field_class ) subset_soil$field_class <-ifelse(subset_soil$yield_tha >quantile(subset_soil$yield_tha, 0.10) & subset_soil$yield_tha <quantile(subset_soil$yield_tha, 0.90),"YAF", subset_soil$field_class )# subset highest yielding fields only yhf <-subset(subset_soil, field_class =="YHF")# add column with yhf in t/ha to data frame subset_soil["yhf_tha"] <-mean(yhf$yield_tha, na.rm = T)# bind all individual fields into single data frame data_final <-rbind(data_final, subset_soil) } }}# ```## geoadditive efficiency model### Collect geocoodinates or district shapefiles z```{r}# Write formulae for parameters for the mean model, sigma model, and inefficiency model# We use the original functional forms: identity for mean model, log sigma for sigma model and log ineff for the inefficiency# mu_formula <- yield_tha ~# season_year + gyga_gdd + gyga_tseas + seed_kgha + variety +# gyga_ai + gyga_av_water + soil_depth + soil_fertility + waterlogging_yn + drought_yn + soilwatercons_yn +# nfert_kgha + manure_yn + residues_yn + previous_crop + oxplough_freq_cat +# herb_lha + handweeding_persdayha + weeding_yn + pesticide_yn + disease_incidence_yn + pest_incidence_yn# sigma_v_formula <- ~ 1 # sigma_u_formula <- ~ 1 + weeding_yn#s <- -1 # production function specification# Fit model# If using older versions of R you can use the following code# model <- mgcv::gam(# formula = list(mu_formula, sigma_v_formula, sigma_u_formula),# data = LDS_wheat_public_cleaned, family = normhnorm(s = s), optimizer = c("efs")# )# Eth_model_dsfa<-dsfa(formula=list(mu_formula, sigma_v_formula, sigma_u_formula),# data=data_final, family=comper(s=s, distr="normhnorm"), optimizer = c("efs"))# summary(Eth_model_dsfa)# plot(Eth_model_dsfa, select = 1) # Estimated function# plot(Eth_model_dsfa, select = 2) ```# Discussion: Yield gap analysis as a prioritization frameworkWe turn next to discuss how yield gap analysis using stochastic frontier analysis can be used as a prioritization framework. Similar ideas can be extended to account for bad outputs like GHG and multiple outputs.```{r}```# Conclusion# ReferencesLiu, Y., and Myers, R. 2009. "Model selection in stochastic frontier analysis with an application to maize production in Kenya". Journal of Productivity Analysis 31: 33-46. Doi: https://doi.org/10.1007/s11123-008-0111-9.Olsen, J.V., and Henningsen, A.2011. "Investment Utilisation, Adjustment Costs, and Technical Efficiency in Danish Pig Farms." FOI Working Paper, No. 2011/13, University of Copenhagen, Department of Food and Resource Economics (IFRO),Copenhagen.Pross, C., Strumann, C., Geissler, A., Herwatz, H., and Klein, N. 2018. "Quality and resource efficiency in hospital service provision: A geoadditive stochastic frontier analysis of stroke quality of care in Germany". PLOS One. Doi:https://doi.org/10.1371/journal.pone.0203017.